Stationary optimal process in discounted dynamic programming

Similar resources

Constrained Discounted Dynamic Programming

This paper deals with constrained optimization of Markov Decision Processes with a countable state space, compact action sets, continuous transition probabilities, and upper semi-continuous reward functions. The objective is to maximize the expected total discounted reward for one reward function, under several inequality constraints on similar criteria with other reward functions. Suppose a fe...

Determining Optimal Stationary Strategies for Discounted Stochastic Optimal Control Problem on Networks

A stochastic version of the discrete optimal control problem with an infinite time horizon and a discounted integral-time cost criterion is considered. This problem is formulated and studied on certain networks. A polynomial-time algorithm for determining the optimal stationary strategies for the considered problems is proposed and some applications of the algorithm for related Markov decision problem...

Smooth Value and Policy Functions for Discounted Dynamic Programming

We consider a discounted dynamic program in which the spaces of states and actions are smooth (in a sense that is suitable for the problem at hand) manifolds. We give conditions that ensure that the optimal policy and the value function are smooth functions of the state when the discount factor is small. In addition, these functions vary in a Lipschitz manner as the reward function-discount fac...

Discounted Optimal Stopping Problems for the Maximum Process

The maximality principle [6] is shown to be valid in some examples of discounted optimal stopping problems for the maximum process. In each of these examples explicit formulas for the value functions are derived and the optimal stopping times are displayed. In particular, in the framework of the Black-Scholes model, the fair prices of two lookback options with infinite horizon are calculated. T...

Gaussian process dynamic programming

Reinforcement learning (RL) and optimal control of systems with continuous states and actions require approximation techniques in most interesting cases. In this article, we introduce Gaussian process dynamic programming (GPDP), an approximate value-function based RL algorithm. We consider both a classic optimal control problem, where problem-specific prior knowledge is available, and a classic...

Journal

Journal title: Applicationes Mathematicae

Year: 1977

ISSN: 1233-7234,1730-6280

DOI: 10.4064/am-15-4-475-487